| Number of Variables | 15 |
|---|---|
| Number of Rows | 3.6328e+06 |
| Missing Cells | 0 |
| Missing Cells (%) | 0.0% |
| Duplicate Rows | 712077 |
| Duplicate Rows (%) | 19.6% |
| Total Size in Memory | 1.1 GB |
| Average Row Size in Memory | 322.1 B |
| Variable Types | Categorical: 3, Numerical: 12 |

| jaro_distance is skewed | Skewed |
|---|---|
| jaro_winkler_distance is skewed | Skewed |
| overlap_coefficient_distance is skewed | Skewed |
| generalized_jaccard_distance is skewed | Skewed |
| tfidf_distance is skewed | Skewed |
| soft_tfidf_distance is skewed | Skewed |
| Dataset has 712077 (19.6%) duplicate rows | Duplicates |
| ProductID has a high cardinality: 1709 distinct values | High Cardinality |
| ProductID2 has a high cardinality: 1709 distinct values | High Cardinality |
| same_product has constant length 1 | Constant Length |
| jaro_winkler_distance has 1179320 (32.46%) zeros | Zeros |
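
The percentages quoted in the overview and in the insights above are plain ratios over the row count. The sketch below recomputes them. It assumes the exact row count is 3,632,836: the overview only shows the rounded 3.6328e+06, and the exact figure is taken from the Decimal Number count of the constant-length-1 `same_product` column. The helper `pct` is a hypothetical name used for illustration only.

```python
# Assumption: exact row count; the overview table rounds it to 3.6328e+06.
N_ROWS = 3_632_836
DUPLICATE_ROWS = 712_077   # "Duplicate Rows" in the overview table
ZERO_CELLS = 1_179_320     # zeros reported by the "Zeros" insight


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


duplicate_pct = pct(DUPLICATE_ROWS, N_ROWS)
zero_pct = pct(ZERO_CELLS, N_ROWS)

print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"zeros: {zero_pct:.2f}%")
```

Under that assumption both figures round to the values the report shows (19.6% and 32.46%).
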
categorical
| Approximate Distinct Count | 1709 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 425177138 |
| Mean | 52.0373 |
|---|---|
| Standard Deviation | 20.4398 |
| Median | 50 |
| Minimum | 11 |
| Maximum | 120 |
| 1st row | amd ryzen 5 1600 b... |
|---|---|
| 2nd row | amd ryzen 5 1600 b... |
| 3rd row | amd ryzen 5 1600 b... |
| 4th row | amd ryzen 5 1600 b... |
| 5th row | amd ryzen 5 1600 b... |
| Count | 105138772 |
|---|---|
| Lowercase Letter | 105138772 |
| Space Separator | 29853678 |
| Uppercase Letter | 0 |
| Dash Punctuation | 0 |
| Decimal Number | 51330486 |
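
The row labels in the character breakdown above match Unicode general category names (Ll, Zs, Lu, Pd, Nd). As a rough illustration of how such counts could be produced for a single string, here is a minimal sketch using Python's standard `unicodedata` module. It is an assumption that the profiler counts characters this way. The function name `count_categories` and the sample string are illustrative only, since the report's own sample rows are truncated.

```python
import unicodedata
from collections import Counter

# Unicode general-category codes for the rows shown in the report.
CATEGORY_LABELS = {
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def count_categories(text: str) -> dict:
    """Count the characters of `text` that fall in each labelled category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {label: codes.get(code, 0) for code, label in CATEGORY_LABELS.items()}


# Illustrative string only; it is not taken from the truncated sample rows.
counts = count_categories("amd ryzen 5 1600")
print(counts)
```

Summing these per-string counts over a whole column would give column-level totals of the kind listed in the table.
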
categorical
| Approximate Distinct Count | 1709 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 425177138 |
| Mean | 52.0373 |
|---|---|
| Standard Deviation | 20.4398 |
| Median | 50 |
| Minimum | 11 |
| Maximum | 120 |
| 1st row | amd ryzen 5 1600 b... |
|---|---|
| 2nd row | amd ryzen 5 1600 |
| 3rd row | amd ryzen 5 1600 b... |
| 4th row | amd ryzen 5 1600 y... |
| 5th row | amd ryzen 5 1600 b... |
| Count | 105138772 |
|---|---|
| Lowercase Letter | 105138772 |
| Space Separator | 29853678 |
| Uppercase Letter | 0 |
| Dash Punctuation | 0 |
| Decimal Number | 51330486 |
categorical
| Approximate Distinct Count | 2 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 239767176 |
| Mean | 1 |
|---|---|
| Standard Deviation | 0 |
| Median | 1 |
| Minimum | 1 |
| Maximum | 1 |
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
| Count | 0 |
|---|---|
| Lowercase Letter | 0 |
| Space Separator | 0 |
| Uppercase Letter | 0 |
| Dash Punctuation | 0 |
| Decimal Number | 3632836 |
numerical
| Approximate Distinct Count | 3008 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.7456 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 2428 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.5312 |
| Q1 | 0.6977 |
| Median | 0.7746 |
| Q3 | 0.8256 |
| 95-th Percentile | 0.8795 |
| Maximum | 1 |
| Range | 1 |
| IQR | 0.1279 |
| Mean | 0.7456 |
|---|---|
| Standard Deviation | 0.1179 |
| Variance | 0.01391 |
| Sum | 2.7088e+06 |
| Skewness | -1.7739 |
| Kurtosis | 5.064 |
| Coefficient of Variation | 0.1582 |
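
Several rows in each numerical block are derived from others: Range is Maximum minus Minimum, IQR is Q3 minus Q1, Variance is the square of the Standard Deviation, and the Coefficient of Variation is the Standard Deviation divided by the Mean. The sketch below recomputes them from the rounded figures of the block above. Small differences in the last digit are expected because the inputs are already rounded.

```python
# Values copied from the numerical block directly above (already rounded).
minimum, maximum = 0.0, 1.0
q1, q3 = 0.6977, 0.8256
mean, std = 0.7456, 0.1179

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # interquartile range
variance = std ** 2               # Variance
cv = std / mean                   # Coefficient of Variation

print(f"range={value_range:.4f} iqr={iqr:.4f} "
      f"variance={variance:.5f} cv={cv:.4f}")
```

The same relations can be used to sanity-check any of the other numerical blocks in this report.
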
numerical
| Approximate Distinct Count | 8341 |
|---|---|
| Approximate Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.9199 |
| Minimum | 0 |
| Maximum | 1.3971 |
| Zeros | 2428 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.622 |
| Q1 | 0.8302 |
| Median | 0.9273 |
| Q3 | 1.0326 |
| 95-th Percentile | 1.2021 |
| Maximum | 1.3971 |
| Range | 1.3971 |
| IQR | 0.2024 |
| Mean | 0.9199 |
|---|---|
| Standard Deviation | 0.1779 |
| Variance | 0.03164 |
| Sum | 3.3417e+06 |
| Skewness | -0.7324 |
| Kurtosis | 2.0668 |
| Coefficient of Variation | 0.1934 |
numerical
| Approximate Distinct Count | 238318 |
|---|---|
| Approximate Unique (%) | 6.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.8223 |
| Minimum | 0 |
| Maximum | 1.1382 |
| Zeros | 2428 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.5695 |
| Q1 | 0.7568 |
| Median | 0.8455 |
| Q3 | 0.9154 |
| 95-th Percentile | 1.0181 |
| Maximum | 1.1382 |
| Range | 1.1382 |
| IQR | 0.1586 |
| Mean | 0.8223 |
|---|---|
| Standard Deviation | 0.1431 |
| Variance | 0.02049 |
| Sum | 2.9874e+06 |
| Skewness | -1.2684 |
| Kurtosis | 3.2385 |
| Coefficient of Variation | 0.1741 |
numerical
| Approximate Distinct Count | 2896 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.8129 |
| Minimum | 0 |
| Maximum | 0.98 |
| Zeros | 2428 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.6087 |
| Q1 | 0.7755 |
| Median | 0.8427 |
| Q3 | 0.8864 |
| 95-th Percentile | 0.9315 |
| Maximum | 0.98 |
| Range | 0.98 |
| IQR | 0.1109 |
| Mean | 0.8129 |
|---|---|
| Standard Deviation | 0.1135 |
| Variance | 0.01289 |
| Sum | 2.9532e+06 |
| Skewness | -2.2734 |
| Kurtosis | 8.1729 |
| Coefficient of Variation | 0.1397 |
numerical
| Approximate Distinct Count | 327256 |
|---|---|
| Approximate Unique (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.9899 |
| Minimum | 0.9091 |
| Maximum | 0.9969 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0.9091 |
|---|---|
| 5-th Percentile | 0.9826 |
| Q1 | 0.9886 |
| Median | 0.991 |
| Q3 | 0.9924 |
| 95-th Percentile | 0.994 |
| Maximum | 0.9969 |
| Range | 0.08776 |
| IQR | 0.003787 |
| Mean | 0.9899 |
|---|---|
| Standard Deviation | 0.004094 |
| Variance | 1.6762e-05 |
| Sum | 3.596e+06 |
| Skewness | -2.9449 |
| Kurtosis | 18.7039 |
| Coefficient of Variation | 0.004136 |
numerical
| Approximate Distinct Count | 170247 |
|---|---|
| Approximate Unique (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.3004 |
| Minimum | 0 |
| Maximum | 0.6941 |
| Zeros | 1179320 |
| Zeros (%) | 32.5% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0 |
| Q1 | 0 |
| Median | 0.4044 |
| Q3 | 0.4684 |
| 95-th Percentile | 0.5359 |
| Maximum | 0.6941 |
| Range | 0.6941 |
| IQR | 0.4684 |
| Mean | 0.3004 |
|---|---|
| Standard Deviation | 0.2145 |
| Variance | 0.046 |
| Sum | 1.0913e+06 |
| Skewness | -0.5715 |
| Kurtosis | -1.4456 |
| Coefficient of Variation | 0.714 |
numerical
| Approximate Distinct Count | 121 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.8032 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 9598 |
| Zeros (%) | 0.3% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.5 |
| Q1 | 0.7 |
| Median | 0.8333 |
| Q3 | 1 |
| 95-th Percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| IQR | 0.3 |
| Mean | 0.8032 |
|---|---|
| Standard Deviation | 0.1833 |
| Variance | 0.03358 |
| Sum | 2.918e+06 |
| Skewness | -0.9517 |
| Kurtosis | 0.9156 |
| Coefficient of Variation | 0.2281 |
numerical
| Approximate Distinct Count | 252 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.9109 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 2528 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.75 |
| Q1 | 0.8696 |
| Median | 0.9333 |
| Q3 | 1 |
| 95-th Percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| IQR | 0.1304 |
| Mean | 0.9109 |
|---|---|
| Standard Deviation | 0.09603 |
| Variance | 0.009222 |
| Sum | 3.3091e+06 |
| Skewness | -2.2154 |
| Kurtosis | 10.2656 |
| Coefficient of Variation | 0.1054 |
numerical
| Approximate Distinct Count | 2930 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.9464 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 2430 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.8385 |
| Q1 | 0.9235 |
| Median | 0.9629 |
| Q3 | 1 |
| 95-th Percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| IQR | 0.07647 |
| Mean | 0.9464 |
|---|---|
| Standard Deviation | 0.06703 |
| Variance | 0.004492 |
| Sum | 3.4382e+06 |
| Skewness | -4.0795 |
| Kurtosis | 37.023 |
| Coefficient of Variation | 0.07082 |
numerical
| Approximate Distinct Count | 1915827 |
|---|---|
| Approximate Unique (%) | 52.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.9927 |
| Minimum | 0.9091 |
| Maximum | 1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0.9091 |
|---|---|
| 5-th Percentile | 0.9867 |
| Q1 | 0.9914 |
| Median | 0.9933 |
| Q3 | 0.9949 |
| 95-th Percentile | 0.9969 |
| Maximum | 1 |
| Range | 0.09091 |
| IQR | 0.003497 |
| Mean | 0.9927 |
|---|---|
| Standard Deviation | 0.003567 |
| Variance | 1.2722e-05 |
| Sum | 3.6062e+06 |
| Skewness | -2.6665 |
| Kurtosis | 20.0914 |
| Coefficient of Variation | 0.003593 |
numerical
| Approximate Distinct Count | 96 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.582 |
| Minimum | 0 |
| Maximum | 0.95 |
| Zeros | 5824 |
| Zeros (%) | 0.2% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.32 |
| Q1 | 0.52 |
| Median | 0.61 |
| Q3 | 0.68 |
| 95-th Percentile | 0.76 |
| Maximum | 0.95 |
| Range | 0.95 |
| IQR | 0.16 |
| Mean | 0.582 |
|---|---|
| Standard Deviation | 0.1374 |
| Variance | 0.01888 |
| Sum | 2.1141e+06 |
| Skewness | -1.0744 |
| Kurtosis | 1.5815 |
| Coefficient of Variation | 0.2361 |
numerical
| Approximate Distinct Count | 3201 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 58125376 |
| Mean | 0.5367 |
| Minimum | 0 |
| Maximum | 0.9254 |
| Zeros | 2434 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 0.3108 |
| Q1 | 0.4382 |
| Median | 0.5362 |
| Q3 | 0.64 |
| 95-th Percentile | 0.7778 |
| Maximum | 0.9254 |
| Range | 0.9254 |
| IQR | 0.2018 |
| Mean | 0.5367 |
|---|---|
| Standard Deviation | 0.1428 |
| Variance | 0.02038 |
| Sum | 1.9496e+06 |
| Skewness | -0.05909 |
| Kurtosis | -0.2144 |
| Coefficient of Variation | 0.266 |